A Data Mining Approach for selecting Bitmap Join Indices

Authors

  • Ladjel Bellatreche
  • Rokia Missaoui
  • Hamid Necir
  • Habiba Drias
Abstract

Index selection is one of the most important decisions in the physical design of relational data warehouses. Indices significantly reduce the cost of processing complex OLAP queries, but they consume storage and incur maintenance overhead. Two main types of indices are available: mono-attribute indices (e.g., B-tree, bitmap, hash) and multi-attribute indices (join indices, bitmap join indices). Bitmap join indices are well suited to optimizing star join queries, which are characterized by joins between a large fact table and multiple dimension tables and by selections on the dimension tables. They require less storage thanks to their binary representation. However, selecting these indices is a difficult task due to the exponential number of candidate attributes to be indexed. Most approaches to index selection follow two main steps: (1) pruning the search space (i.e., reducing the number of candidate attributes) and (2) selecting indices using the pruned search space. In this paper, we first propose a data mining driven approach to prune the search space of the bitmap join index selection problem. As opposed to an existing technique that uses only the frequency of attributes in queries as a pruning metric, our technique also uses other parameters, such as the size of the dimension tables involved in the indexing process, the size of each dimension tuple, and the page size on disk. We then define a greedy algorithm to select bitmap join indices that minimize processing cost and satisfy the storage constraint. Finally, to evaluate the efficiency of our approach, we compare it with some existing techniques.
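
The two-step scheme described in the abstract (prune, then select greedily under a storage constraint) can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not give their pruning metric or cost model, so the sketch prunes on query frequency alone (the simpler baseline the abstract contrasts against) and ranks candidates by benefit per unit of storage, which is an assumption. The names `Candidate`, `prune_by_frequency`, `greedy_select`, `benefit`, `storage`, and `min_support` are all hypothetical, and each candidate's benefit is treated as independent of the others for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A candidate bitmap join index on one dimension attribute (hypothetical)."""
    attribute: str   # e.g. "Customer.city"
    benefit: float   # assumed estimate of query-cost reduction
    storage: float   # assumed estimate of index size, same unit as the budget


def prune_by_frequency(query_attrs, min_support):
    """Step 1 (baseline): keep attributes used by at least
    min_support (a fraction in [0, 1]) of the workload queries."""
    counts = {}
    for attrs in query_attrs:
        for attr in set(attrs):          # count each attribute once per query
            counts[attr] = counts.get(attr, 0) + 1
    total = len(query_attrs)
    return {a for a, c in counts.items() if c / total >= min_support}


def greedy_select(candidates, budget):
    """Step 2: take candidates in decreasing benefit-per-storage order,
    skipping any that would exceed the storage budget."""
    chosen, used = [], 0.0
    ranked = sorted(candidates, key=lambda c: c.benefit / c.storage, reverse=True)
    for cand in ranked:
        if cand.benefit > 0 and used + cand.storage <= budget:
            chosen.append(cand)
            used += cand.storage
    return chosen, used
```

A greedy pass of this kind is fast but not guaranteed to find the best set under the budget; a fuller version would re-estimate each candidate's benefit after every pick, since indices on related attributes can overlap in the queries they speed up.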

Similar resources

Yet Another Algorithms for Selecting Bitmap Join Indexes

One of the fundamental tasks that a data warehouse (DW) administrator needs to perform during physical design is to select the right indexes to speed up her/his queries. Two categories of indexes are available and supported by the main DBMS vendors: (i) indexes defined on a single table and (ii) indexes defined on multiple tables such as join indexes, bitmap join indexes, etc. Selecting relev...

A Constraint-based Mining Approach for Multi-attribute Index Selection

The index selection problem (ISP) concerns the selection of an appropriate index set to minimize the total cost for a given workload under a storage constraint. Since the ISP has been proven to be an NP-hard problem, most studies focus on heuristic algorithms to obtain approximate solutions. The problem becomes more difficult for indexes defined on multiple tables such as bitmap join indexes, s...

Bitmap Index-Based Decision Trees

In this paper, we propose an original approach to applying data mining algorithms, namely decision tree-based methods, that takes into account not only the size of the processed databases but also the processing time. The key idea consists in constructing a decision tree, within the DBMS, using bitmap indices. Indeed, bitmap indices have many useful properties, such as the count and bit-wise operations. We w...

Accelerating Spatial Join Operations using Bit-Indices

Spatial join is a very expensive operation in spatial databases. In this paper, we propose an innovative method for accelerating spatial join operations using Spatial Join Bitmap (SJB) indices. The SJB indices are used to keep track of intersecting entities in the joining data sets. We provide algorithms for constructing SJB indices and for maintaining the SJB indices when the data sets are upd...

Automatic Selection of Bitmap Join Indexes in Data Warehouses

The queries defined on data warehouses are complex and use several join operations that induce an expensive computational cost. This cost becomes even more prohibitive when queries access very large volumes of data. To improve response time, data warehouse administrators generally use indexing techniques such as star join indexes or bitmap join indexes. This task is nevertheless complex and fas...

Journal title:
  • JCSE

Volume 1  Issue 

Pages  -

Publication date 2007